# OpenCLIP

## Docs

- [Custom Models](https://mintlify.wiki/mlfoundations/open_clip/advanced/custom-models.md): Create custom CLIP model architectures with custom vision and text encoders
- [Gradient Accumulation](https://mintlify.wiki/mlfoundations/open_clip/advanced/gradient-accumulation.md): Simulate larger batch sizes with gradient accumulation to overcome memory limitations
- [Int8 Support](https://mintlify.wiki/mlfoundations/open_clip/advanced/int8-support.md): Train and run inference with 8-bit quantization for improved speed and reduced memory usage
- [Model Distillation](https://mintlify.wiki/mlfoundations/open_clip/advanced/model-distillation.md): Distill knowledge from larger CLIP models into smaller, more efficient models
- [Push to Hub](https://mintlify.wiki/mlfoundations/open_clip/advanced/push-to-hub.md): Upload trained CLIP models to Hugging Face Hub for sharing and deployment
- [Remote Training](https://mintlify.wiki/mlfoundations/open_clip/advanced/remote-training.md): Train models with automatic cloud storage backup and resume from remote checkpoints
- [AugmentationCfg](https://mintlify.wiki/mlfoundations/open_clip/api/augmentation-cfg.md): Configuration dataclass for image augmentation during training
- [build_zero_shot_classifier](https://mintlify.wiki/mlfoundations/open_clip/api/build-zero-shot-classifier.md): Build zero-shot classifier weights from class names and text templates
- [ClipLoss](https://mintlify.wiki/mlfoundations/open_clip/api/clip-loss.md): Contrastive loss function for CLIP model training with distributed support
- [CLIP](https://mintlify.wiki/mlfoundations/open_clip/api/clip-model.md): Core CLIP model implementation for contrastive learning between images and text
- [CoCaLoss](https://mintlify.wiki/mlfoundations/open_clip/api/coca-loss.md): Combined contrastive and captioning loss for training CoCa (Contrastive Captioner) models
- [CoCa](https://mintlify.wiki/mlfoundations/open_clip/api/coca-model.md): Contrastive Captioner model combining CLIP-style contrastive learning with autoregressive text generation
- [create_model](https://mintlify.wiki/mlfoundations/open_clip/api/create-model.md): Creates and configures a contrastive vision-language model
- [create_model_and_transforms](https://mintlify.wiki/mlfoundations/open_clip/api/create-model-and-transforms.md): Creates a model with preprocessing transforms for training and validation
- [create_model_from_pretrained](https://mintlify.wiki/mlfoundations/open_clip/api/create-model-from-pretrained.md): Creates a model from pretrained weights with optional preprocessing transform
- [CustomTextCLIP](https://mintlify.wiki/mlfoundations/open_clip/api/custom-text-clip.md): CLIP variant with a separately constructed text encoder tower for maximum flexibility
- [decode](https://mintlify.wiki/mlfoundations/open_clip/api/decode.md): Convert token IDs back to text strings
- [get_pretrained_cfg](https://mintlify.wiki/mlfoundations/open_clip/api/get-pretrained-cfg.md): Get configuration for a specific pretrained model and tag
- [get_tokenizer](https://mintlify.wiki/mlfoundations/open_clip/api/get-tokenizer.md): Get the appropriate tokenizer for a CLIP model
- [image_transform_v2](https://mintlify.wiki/mlfoundations/open_clip/api/image-transform.md): Create image preprocessing transforms with configuration objects
- [list_models](https://mintlify.wiki/mlfoundations/open_clip/api/list-models.md): Enumerate available model architectures
- [list_pretrained](https://mintlify.wiki/mlfoundations/open_clip/api/list-pretrained.md): List all available pretrained model/weight combinations
- [load_checkpoint](https://mintlify.wiki/mlfoundations/open_clip/api/load-checkpoint.md): Load a checkpoint into an existing model
- [PreprocessCfg](https://mintlify.wiki/mlfoundations/open_clip/api/preprocess-cfg.md): Configuration dataclass for image preprocessing
- [SigLipLoss](https://mintlify.wiki/mlfoundations/open_clip/api/siglip-loss.md): Sigmoid-based contrastive loss for language-image pre-training with improved efficiency and performance
- [tokenize](https://mintlify.wiki/mlfoundations/open_clip/api/tokenize.md): Tokenize text strings for CLIP models
- [Zero-Shot Metadata](https://mintlify.wiki/mlfoundations/open_clip/api/zero-shot-metadata.md): Pre-defined templates and class names for zero-shot classification
- [CLIP Overview](https://mintlify.wiki/mlfoundations/open_clip/concepts/clip-overview.md): Understanding the CLIP architecture and how it learns visual-semantic representations
- [Contrastive Learning](https://mintlify.wiki/mlfoundations/open_clip/concepts/contrastive-learning.md): Understanding how CLIP learns visual-semantic embeddings through contrastive loss
- [Zero-Shot Classification](https://mintlify.wiki/mlfoundations/open_clip/concepts/zero-shot-classification.md): How CLIP performs classification on unseen categories without fine-tuning
- [Benchmark Results](https://mintlify.wiki/mlfoundations/open_clip/evaluation/benchmarks.md): Comprehensive benchmark results for OpenCLIP models across 38 evaluation datasets
- [Evaluation Metrics](https://mintlify.wiki/mlfoundations/open_clip/evaluation/metrics.md): Understanding the metrics used to evaluate OpenCLIP models
- [Zero-Shot Evaluation](https://mintlify.wiki/mlfoundations/open_clip/evaluation/zero-shot.md): Learn how to evaluate OpenCLIP models on zero-shot tasks during and after training
- [Installation](https://mintlify.wiki/mlfoundations/open_clip/installation.md): Install OpenCLIP for inference, training, or development
- [Introduction](https://mintlify.wiki/mlfoundations/open_clip/introduction.md): OpenCLIP is an open-source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training) for training and using vision-language models.
- [Quickstart](https://mintlify.wiki/mlfoundations/open_clip/quickstart.md): Get started with OpenCLIP in minutes - load a model and perform zero-shot image classification
- [CoCa Training](https://mintlify.wiki/mlfoundations/open_clip/training/coca.md): Train CoCa (Contrastive Captioner) models for image captioning and contrastive learning
- [Training Configuration](https://mintlify.wiki/mlfoundations/open_clip/training/configuration.md): Complete reference for all OpenCLIP training parameters and hyperparameters
- [Data Preparation](https://mintlify.wiki/mlfoundations/open_clip/training/data-preparation.md): Prepare training data in CSV and WebDataset formats for CLIP training
- [Distributed Training](https://mintlify.wiki/mlfoundations/open_clip/training/distributed-training.md): Advanced techniques for efficient large-scale CLIP training
- [Fine-tuning](https://mintlify.wiki/mlfoundations/open_clip/training/fine-tuning.md): Fine-tune pretrained CLIP models for improved performance on specific tasks
- [Multi-Node Training](https://mintlify.wiki/mlfoundations/open_clip/training/multi-node.md): Scale CLIP training across multiple machines with torchrun and SLURM
- [Training Overview](https://mintlify.wiki/mlfoundations/open_clip/training/overview.md): Overview of training CLIP models with OpenCLIP
- [Single-Node Training](https://mintlify.wiki/mlfoundations/open_clip/training/single-node.md): Train CLIP models on a single machine with multiple GPUs
- [Image Preprocessing](https://mintlify.wiki/mlfoundations/open_clip/usage/image-preprocessing.md): Configure image transformations, normalization, augmentation, and preprocessing for OpenCLIP models
- [Inference](https://mintlify.wiki/mlfoundations/open_clip/usage/inference.md): Complete guide to running inference with OpenCLIP models including image/text encoding, similarity computation, and batch processing
- [Loading Models](https://mintlify.wiki/mlfoundations/open_clip/usage/loading-models.md): Learn how to load OpenCLIP models from various sources including pretrained weights, HuggingFace Hub, and local directories
- [Pretrained Models](https://mintlify.wiki/mlfoundations/open_clip/usage/pretrained-models.md): Complete catalog of pretrained OpenCLIP models with performance metrics, training details, and usage examples
- [Tokenization](https://mintlify.wiki/mlfoundations/open_clip/usage/tokenization.md): Text tokenization for OpenCLIP models including SimpleTokenizer, HFTokenizer, and SigLipTokenizer with context length handling

## OpenAPI Specs

- [openapi](https://mintlify.wiki/mlfoundations/open_clip/api-reference/openapi.json)